The recent availability of data for cities has allowed scientists to exhibit scalings, which take the form of a power-law dependence on population of various socio-economic and structural indicators. We propose here a stochastic theory of urban growth which accounts for some of the observed scalings, and we confirm these predictions on US and OECD empirical data. In particular, we show that the dependence on population size of the total number of miles driven daily, the total length of the road network, the total traffic delay, the total consumption of gasoline, the quantity of CO2 emitted and the relation between area and population of cities are all governed by a single parameter which characterizes the sensitivity to congestion. Our results suggest that diseconomies associated with congestion scale superlinearly with population size, implying that, despite polycentrism, cities whose transportation infrastructure relies heavily on traffic-sensitive modes are unsustainable.

The recent availability of an unprecedented amount of data has made possible quantitative studies of urban systems [1-3], opening the way to a new Science of Cities. In particular, the discovery of allometric scaling relationships in cities has driven the quantitative research on urban systems in the past years. Indeed, there is a great amount of evidence that different socio-economic indicators in cities, such as the GDP, the crime rate and the number of patents, as well as different structural indicators such as the total length of the road network, the urbanized land area, etc., exhibit robust scaling relationships with respect to population [4-10]. The existence of these simple scaling relationships hints at the existence of universal processes shared by urban systems, and thus at the possibility of modeling cities.

A common trait shared by all complex systems, including cities, is the existence of a large variety of processes occurring over a wide range of time and spatial scales. The main obstacle to the understanding of these systems therefore resides in uncovering the hierarchy of processes and in singling out the few that govern their dynamics. Albeit difficult, the hierarchisation of processes is of prime importance. A failure to do so leads to models which are either too complex to give any real insight into the phenomenon, or too simple and abstract to bear any resemblance to reality. As a matter of fact, despite numerous attempts [5,11-15], a theoretical understanding of many observed empirical regularities in cities is still missing.

In the present study, we show that the spatial structure of the mobility pattern controls the behaviour of many quantities in urban systems. Indeed, cities are defined not only by the spatial organisation of places fulfilling different functions (shops, places of residence, workplaces, etc.) but also by the way individuals move among them. Understanding where people live, and where and how they travel within the city, thus appears as a necessary step towards a scientific theory of cities.

Although an increasing amount of data about mobility is now available [16], we still lack a simple model explaining the dominant mechanisms governing the formation and evolution of mobility patterns. Many factors, such as geographical constraints, the location of facilities and the available transportation, to name a few, can impact mobility, which thus appears as an intricate issue. Here, we tackle the problem of mobility by making simplifying (yet not simple) assumptions, trying to grasp the most important parameters which define the problem. We thus build upon a simple out-of-equilibrium model previously developed [17]. This model, among other things, accounts for the polycentric transition of cities and gives a prediction for the number of centers as a function of population. We show that this framework allows us to predict the behaviour of many quantities related to mobility and the structure of cities: the scaling with population of the total time wasted in congestion, transport-related CO2 emissions, total travelled distance, total lane miles and surface area.

Our results allow us to give quantitative insight into two important debates around urban systems. First, we are able to discuss the benefits of polycentricity and quantify some of its aspects. Then, maybe more importantly, we are able to put into perspective the sustainability of urban systems.
Results

Naive scalings. We start by presenting some naive arguments to estimate the scaling exponents for the area $A$, the total daily distance driven $L_{tot}$ and the total lane miles $L_N$. Although these predictions turn out to be wrong, naive scalings are useful as a first approach to the problem, as they allow us to understand how the different quantities relate to each other.

Surface area. First, we estimate the dependence of the area $A$ of a city on its population $P$, a long-standing problem in the field [5]. A first crude approach would be to assume that cities evolve in such a way that their population density $\rho = P/A$ remains constant. This assumption straightforwardly implies that the area should scale linearly with population

$$A \sim \lambda^2 P \qquad (1)$$

where $\lambda^2$ is the average surface occupied by each individual (the assumption of a constant density is then equivalent to that of a constant average surface per capita).

Total length of roads. We now estimate the total length $L_N$ of all the roads within a city. If we consider that the network formed by streets is such that all the nodes (intersections) are connected to their closest neighbour, the typical length of a road segment is given by

$$\ell_R \sim \sqrt{\frac{A}{N}} \qquad (2)$$

where $N$ is the number of intersections. Previous studies of road networks in different regions, and over extended time periods [18,19], have shown that the number of intersections is proportional to the population size. Therefore, the typical length of a road segment (between two intersections) varies with the population size $P$ as

$$\ell_R \sim \sqrt{\frac{A}{P}} \qquad (3)$$

and the total length of the network $L_N \sim P\,\ell_R$ should then scale as

$$\frac{L_N}{\sqrt{A}} \sim \sqrt{P} \qquad (4)$$

Using the naive scaling for the dependence of $A$ on population size given previously in Eq. 1, we finally get

$$L_N \sim P \qquad (5)$$



Total daily commuting distance. Individual constraint. We also estimate the total commuting distance $L_{tot}$. The first constraint on this distance comes from individuals' limitations and behaviour. We make here the simple assumption that individuals choose their residence and workplace such that their total commuting distance is fixed (or at least smaller than a certain value) and equal on average to $\ell_c$. In that case, we simply have

$$\frac{L_{tot}}{P} \sim \text{const.} = \ell_c \qquad (6)$$

(by constant, we mean independent of the population size of the city).

The city structure constraint. An additional constraint on $L_{tot}$ is given by the structure of the city [8,25]. Indeed, the individual commuting distance is also related to the total surface area of the city and the location of activity centers.

If we first assume that the city is monocentric, individuals are all commuting to the same center and the typical commuting distance is controlled by the typical size of the city, of order $\sqrt{A}$, so that

$$\frac{L_{tot}}{\sqrt{A}} \sim P \qquad (7)$$

On the other hand, if we assume that the city is completely decentralized, the typical commuting distance is of the order of the nearest-neighbour distance $\sqrt{A}/\sqrt{P}$, and we obtain

$$\frac{L_{tot}}{\sqrt{A}} \sim \sqrt{P} \qquad (8)$$



Comparison of naive scalings with empirical results. The comparison of the naive exponents with the exponents measured on US data is shown in Table 1 (see the Methods section for details about the data). There are important discrepancies, which we discuss in the following.

First, we note that the naive scaling for the surface area $A$ predicts a value of the exponent that is quantitatively (and worse, qualitatively) different from that observed. Indeed, we find that for real cities

$$A \sim P^{a} \qquad (9)$$

with $a = 0.85$. While the naive argument implies a linear dependence of the surface area $A$ on population, we find a sublinear scaling in the data, which is a qualitatively different behaviour (Table 1). This disagreement on such a basic quantity will naturally impact the scaling of the other quantities.
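Exponents such as $a$ in Eq. 9 are commonly estimated by an ordinary least-squares fit in log-log space. The sketch below assumes that approach; the function name `fit_power_law` and the synthetic cities are illustrative only and do not reproduce the US dataset.

```python
import math
import random

def fit_power_law(populations, values):
    """Least-squares fit of log(values) = log(prefactor) + exponent * log(populations)."""
    xs = [math.log(p) for p in populations]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    r_squared = sxy ** 2 / (sxx * syy)
    return exponent, prefactor, r_squared

# Synthetic (hypothetical) cities: A = P^0.85 with multiplicative log-normal noise.
rng = random.Random(0)
populations = [10 ** rng.uniform(4, 7) for _ in range(500)]
areas = [p ** 0.85 * math.exp(rng.gauss(0.0, 0.2)) for p in populations]

exponent, prefactor, r_squared = fit_power_law(populations, areas)
print(f"fitted exponent = {exponent:.3f}, r^2 = {r_squared:.3f}")
```

Fitting in log space treats the noise as multiplicative, which is the natural choice when the quantity spans several orders of magnitude.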

The data also show that $L_{tot}/P$ can be considered reasonably independent of $P$ (with a value of approximately 23 miles for the US, see Fig. 1), in agreement with the individual constraint assumption (Eq. 6). This finding is also in agreement with the results drawn from census data in Germany [20]. Although this assumption of a constant distance is simple and verified on the US data, we think that it deserves to be systematically tested on other datasets for other countries and cities.

Finally, the scalings of $L_{tot}/\sqrt{A}$ obtained in the extreme cases of a monocentric city structure and a totally decentralized city structure disagree with the value measured on data (see Table 1). This suggests that most cities have a structure that is neither completely centralized nor totally decentralized. In particular, this result casts some doubt on the study [15], which implicitly assumes that cities are always monocentric. Any situation between the two previous extreme cases would give a scaling of the form $L_{tot}/\sqrt{A} \sim P^{b}$ where $b \in [1/2, 1]$. One can easily see that this expression is consistent with those of $A$ and $L_{tot}/P$ if

$$b = 1 - \frac{a}{2} \qquad (10)$$

which is indeed what we observe empirically (up to error bars). This preliminary analysis thus leads us to the conclusion that, in order to compute the various exponents, we need to better describe the structure of commuting patterns. In other words, we need to find a description of cities that goes beyond the naive monocentric or totally decentralized views, and which accounts for the observed sublinear scaling of the surface area $A$.

Table 1 | Value of the exponent governing the behaviour with the population $P$, as obtained by naive arguments and as measured on empirical data. The discrepancies reveal the failure of the naive scaling arguments and the necessity to go further and model mobility patterns. The data used for this table can be found in [38-40].

| Quantity | Naive exponent | Measured value |
|---|---|---|
| $A$ | 1 | 0.85 ± 0.01 ($r^2$ = 0.93) |
| $L_N/\sqrt{A}$ | 0.5 | 0.42 ± 0.02 ($r^2$ = 0.83) |
| $L_N$ | 1 | 0.86 ± 0.02 ($r^2$ = 0.92) |
| $L_{tot}/\sqrt{A}$ | [1/2, 1] | 0.60 ± 0.03 ($r^2$ = 0.90) |
| $L_{tot}/P$ | 0 | 0.03 ± 0.02 ($r^2$ = 0.04) |

Figure 1 | Constant daily driven distance per capita. (a) Daily total driven distance per capita as a function of population for 441 urbanised areas in the US in 2010. The data shown in the plot are compatible with a population-independent behaviour. (b) Histogram of the daily total driven distance per capita for the same cities. The average daily driven distance for these cities is 23 miles, and the standard deviation 7 miles.

Beyond naive scaling: modeling mobility patterns. We begin with the assumption that mobility patterns are mostly driven by daily commuting, and we would like to understand how an individual, given his household location, will choose his job location. We assume that this choice is determined by two dominant factors: the expected wage at a given job, and the commuting time to this job's location. Indeed, places with high average salaries are attractive, but having to spend a significant amount of time commuting every day is less desirable. We assume there are $N_c$ potential activity centers in the city, each characterized by an average wage $w(j)$ at location $j$. This wage is endogenously determined and depends a priori on many factors such as agglomeration effects [21], the type of industry, etc. Although it is in principle possible to write down equations to determine the wage (as attempted in [11] for instance), solving them is not only impossible but also not necessarily useful. A similar situation arises in physics when one studies the behaviour of atoms made of a large number of electrons. Physicists found out [22] that, in fact, a statistical description of these systems relying on random matrices could lead to predictions which agree with experimental results. We would like to import into spatial economics this idea of replacing a complex quantity such as wages, which depends on so many factors and interactions, by a random one. So, we treat the wage as if it were exogenous and random [17], that is we write $w(j) = s\,\eta_j$, where $s$ represents the typical income in this city and $\eta_j$ is a random number chosen uniformly in $[0, 1]$. Furthermore, we assume that the commuting time depends not only on the distance between the two places, but also on the traffic $T_{ij}$ between those two locations. An individual living at $i$ will thus commute to the center $j$ which corresponds to the best trade-off between income and commuting time, that is to the center $j$ such that the quantity

$$Z_{ij} = \eta_j - \frac{d_{ij}}{\ell}\left[1 + \left(\frac{T(j)}{c}\right)^{\mu}\right] \qquad (11)$$

is maximum [17]. The quantity $d_{ij}$ is the Euclidean distance between $i$ and $j$ (both supposed to be scattered randomly across the city), $T(j)$ the total incoming traffic at $j$, $c$ the capacity of the underlying transportation network, and $\mu$ an exponent describing the sensitivity of the network to congestion. The quantity $\ell$ is the maximum distance that people can financially afford to travel daily, defined as the ratio between the typical individual income and the transportation costs per unit of distance.
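The choice rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [17]: it takes $Z_{ij} = \eta_j - (d_{ij}/\ell)\,[1 + (T(j)/c)^{\mu}]$, places homes and centers uniformly in a unit square, and adds individuals one at a time so that each sees the traffic left by those before. The function names and all parameter values are hypothetical.

```python
import math
import random

def choose_center(home, centers, eta, traffic, ell, capacity, mu):
    """Return the index j maximising Z_ij = eta_j - (d_ij / ell) * (1 + (T(j) / c)^mu)."""
    best_j, best_z = None, -math.inf
    for j, (cx, cy) in enumerate(centers):
        d_ij = math.hypot(home[0] - cx, home[1] - cy)  # Euclidean distance
        congestion = (traffic[j] / capacity) ** mu
        z = eta[j] - (d_ij / ell) * (1.0 + congestion)
        if z > best_z:
            best_j, best_z = j, z
    return best_j

def simulate(n_individuals, n_centers, ell, capacity, mu, seed=0):
    """Add individuals one by one; each picks the best center given current traffic."""
    rng = random.Random(seed)
    centers = [(rng.random(), rng.random()) for _ in range(n_centers)]
    eta = [rng.random() for _ in range(n_centers)]  # random wage factor in [0, 1]
    traffic = [0] * n_centers                       # T(j): commuters per center
    for _ in range(n_individuals):
        home = (rng.random(), rng.random())
        j = choose_center(home, centers, eta, traffic, ell, capacity, mu)
        traffic[j] += 1
    return traffic

# Hypothetical parameter values, chosen only for illustration.
traffic = simulate(n_individuals=2000, n_centers=50, ell=1.0, capacity=100, mu=2.0)
active_centers = sum(1 for t in traffic if t > 0)
print(active_centers, "centers attract at least one commuter")
```

Because the congestion term grows with $T(j)$, a center that is initially the most attractive becomes less so as it fills up, which is what lets secondary centers appear.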

This simple model displays a surprisingly rich behaviour [17]. In particular, it accounts for the monocentric to polycentric transition observed in most cities. It has been a well-known fact for quite some time that as cities grow, they evolve from a monocentric organisation, where all the activities are concentrated in the same geographical area (usually the central business district), to a more distributed, polycentric organisation [11,23]. Several theories in spatial economics exist [1], but they are not satisfactory for many reasons. Among other things, they do not take congestion into account and have no predictive, testable content [24]. Within this framework, congestion is actually responsible for the transition, and the number of activity centers in a city of population size $P$ is on average given by

$$k = \left(\frac{P}{P^*}\right)^{\frac{\mu}{\mu+1}} \qquad (12)$$

with

$$P^* = c\left(\frac{\ell}{\sqrt{A}\,N_c}\right)^{1/\mu} \qquad (13)$$

Using data on employment per Zip Code Area in the US [17], we showed that

$$k \sim P^{\alpha} \qquad (14)$$

where we measure $\alpha = 0.64 \pm 0.12$ (95% confidence interval (CI)). In other words, the number of centers scales sublinearly with population size.

Computing the exponents. Area. At this stage, the number of centers is a function of the population and the area

$$k = F(A, P) \qquad (15)$$

and we need an additional equation in order to get a closed system. Here we focus on the area and its evolution with the population size, which reflects the growth process of the city. In the following, we will investigate two different approaches. It is worth noting that both approaches give results in qualitative agreement, showing that some stylized facts, such as super- or sublinearity, are very robust.

Table 2 | Predicted theoretical behaviour and empirical observations versus the population size $P$ for different quantities: $L_{tot}$ is the daily total driven distance, $A$ is the area of the city, $L_N$ is the total length of the road network, $\delta\tau$ is the daily total delay due to congestion, $Q_{gas}$ is the yearly total consumption of gasoline and $Q_{CO_2}$ is the total CO2 emitted yearly due to transportation. In the third (fourth) column, we show the predicted values of the exponent of $P$ using the value of $\alpha$ (of $a$) measured on US data. In the fifth column, we show the value of the exponents directly measured on data about US and OECD cities. The measured values are in good agreement with the prediction. In particular, the exponents for $L_N$ and $\delta\tau$ are consistent with our prediction that their difference should be 1/2.

| Quantity | Theoretical dependence on $P$ (self-consistent case) | Predicted exponent, self-consistent case | Predicted exponent, fitting case | Measured value |
|---|---|---|---|---|
| $L_{tot}$ | $P$ | 1 | 1 | 1.03 ± 0.03 ($r^2$ = 0.95) |
| $A/\ell^2$ | $(P/c)^{2\delta}$ | $2\delta = 0.64$ | $a = 0.85$ | 0.853 ± 0.011 ($r^2$ = 0.93) [38-40] |
| $L_N/\ell$ | $\sqrt{P}\,(P/c)^{\delta}$ | $1/2 + \delta = 0.82$ | $(1 + a)/2 = 0.93$ | 0.765 ± 0.033 ($r^2$ = 0.92) [38-40] |
| $\delta\tau$ | $P\,(P/c)^{\delta}$ | $1 + \delta = 1.32$ | 1.22 | 1.270 ± 0.067 ($r^2$ = 0.97) [38-40] |
| $Q_{gas}$, $Q_{CO_2}$ | $P\,(P/c)^{\delta}$ | $1 + \delta = 1.32$ | 1.22 | 1.262 ± 0.089 ($r^2$ = 0.94) [38-40]; 1.212 ± 0.098 ($r^2$ = 0.83) [41]; 1.33 ± 0.03 [26] |
| $L_N/\sqrt{A}$ | $\sqrt{P}$ | 0.5 | 0.5 | 0.42 ± 0.02 ($r^2$ = 0.83) [38-40] |
| $L_{tot}/\sqrt{A}$ | $P\,(P/c)^{-\delta}$ | $1 - \delta = 0.68$ | $1 - a/2 = 0.58$ | 0.595 ± 0.026 ($r^2$ = 0.90) [38-40] |

Fitting procedure. In the absence of knowledge of the processes responsible for urban sprawl, we can assume that the area behaves as

$$A \sim P^{a} \qquad (16)$$

where $a$ is the exponent to be determined through fits on data. The empirical value of the exponent for the US data is $a \approx 0.85$. Once this exponent is given, we can compute the various exponents for the quantities of interest (see the following and Table 2). We get for the number of centers $k$

$$k \sim P^{\frac{\mu + a/2}{\mu + 1}} \qquad (17)$$

which is sublinear as long as $a < 2$, in agreement with the empirical results for US cities. As we will see, this approach yields the same qualitative behaviours as those predicted with the method of the next section. In other words, even if the main mechanism behind urban sprawl is not congestion, the conclusions of this paper are not affected as long as the area scales sublinearly with population.
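The exponent of the number of centers in the fitting case can be recovered in a few lines. The sketch below assumes Eqs. 12 and 13 in the form $k = (P/P^*)^{\mu/(\mu+1)}$ and $P^* = c\,(\ell/(\sqrt{A}\,N_c))^{1/\mu}$, and treats $c$, $\ell$ and $N_c$ as independent of $P$.

```latex
\begin{align*}
k &= \left(\frac{P}{P^*}\right)^{\frac{\mu}{\mu+1}}
   = \left(\frac{P}{c}\right)^{\frac{\mu}{\mu+1}}
     \left(\frac{\sqrt{A}\,N_c}{\ell}\right)^{\frac{1}{\mu+1}} \\
  &\sim P^{\frac{\mu}{\mu+1}}\,A^{\frac{1}{2(\mu+1)}}
   \sim P^{\frac{\mu}{\mu+1}}\,P^{\frac{a}{2(\mu+1)}}
   = P^{\frac{\mu + a/2}{\mu+1}} .
\end{align*}
```

The exponent is smaller than 1 exactly when $a/2 < 1$, that is when $a < 2$.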

Coherent growth. Let us now assume that the scaling of $A$ with population is determined by the number of activity centers and the constant commuting length of individuals. This means that the growth of the area is controlled by the appearance of new activity centers. If we assume that a city is organized around $k$ activity centers and that the attraction basins of these centers are spatially separated [17], we then have $A \sim k A_1$, where $A_1$ is the area of each subcenter's attraction basin. This area $A_1$ is related to the average individual commuting distance by $\sqrt{A_1} \sim L_{tot}/P$, and we obtain

$$A \sim k\left(\frac{L_{tot}}{P}\right)^2 = k\,\ell_c^2 \qquad (18)$$

This leads to the following expression for the number of centers

$$k \sim \left(\frac{P}{c}\right)^{\frac{2\mu}{2\mu+1}} \qquad (19)$$

where the exponent is always smaller than 1, also in agreement with the empirical results for US cities. We can now also compute the scaling of the surface area

$$A \sim \left(\frac{L_{tot}}{P}\right)^2\left(\frac{P}{c}\right)^{\frac{2\mu}{2\mu+1}} \qquad (20)$$

We further assume that $L_{tot}/P$ is a fraction of the longest possible journey $\ell$ individuals can afford, that is to say

$$\frac{L_{tot}}{P} \sim \ell \qquad (21)$$

It is important to note that if $\ell_c$ were independent of $\ell$, the quantitative predictions of our model would still hold. The final expression for the area is then given by

$$\frac{A}{\ell^2} \sim \left(\frac{P}{c}\right)^{2\delta} \qquad (22)$$

where $\delta = \frac{\mu}{2\mu+1}$. The exponent $\delta$ is smaller than 1/2 whatever $\mu \geq 0$, which implies that the area of cities increases sublinearly with population. In other words, the density of cities increases with population. We verify this prediction in Table 2, with data about the land area of urbanized areas in the US (Figure 2). We find $2\delta_{emp} = 0.85 \pm 0.01$ (95% CI), which is not too far from the theoretical value $2\delta_{th} = 0.64 \pm 0.12$ (95% CI), equal to $\alpha$ in this case.
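The relation between the congestion exponent $\mu$ and $\delta = \mu/(2\mu+1)$ is easy to explore numerically. The sketch below assumes that definition and inverts it to estimate $\mu$ from $2\delta = \alpha = 0.64$; the helper names are illustrative.

```python
def delta_from_mu(mu):
    """delta = mu / (2 mu + 1), defined for mu >= 0."""
    return mu / (2.0 * mu + 1.0)

def mu_from_delta(delta):
    """Inverse relation mu = delta / (1 - 2 delta), valid for 0 <= delta < 1/2."""
    if not 0.0 <= delta < 0.5:
        raise ValueError("delta must lie in [0, 1/2)")
    return delta / (1.0 - 2.0 * delta)

# delta stays below 1/2 however large the congestion exponent mu is.
for mu_value in (0.0, 0.5, 1.0, 10.0, 1000.0):
    print(f"mu = {mu_value:8.1f}  ->  delta = {delta_from_mu(mu_value):.4f}")

# Using 2 * delta = alpha = 0.64 (the value measured for the number of centers).
delta = 0.64 / 2.0
mu = mu_from_delta(delta)
print(f"delta = {delta:.2f}  ->  mu = {mu:.3f}")
```

Under these assumptions the measured $\alpha$ corresponds to a congestion exponent $\mu$ slightly below 1.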

Because the area of an urban system results from centuries of evolution, we do not a priori expect our model, in which individual vehicles are assumed to be the only vector of mobility, to give a prediction valid for all countries and all times. Nevertheless, these results give us reasons to believe that the spatial structure of journey-to-work commuting should still be the dominant factor in the dependence of land area on population.

Total commuting distance. Using Eq. 6 and Eq. 22 we are now able to compute $L_{tot}/\sqrt{A}$

$$\frac{L_{tot}}{\sqrt{A}} \sim P\left(\frac{P}{c}\right)^{-\delta} \qquad (23)$$

We plot $L_{tot}/\sqrt{A}$ for urbanized areas in the US in Figure 2, and one can check in Table 2 that the exponent predicted from the previously measured value of $\alpha$ agrees well with the exponent measured on the data. In the fitting case, the exponent would simply be given by $1 - a/2$.

Figure 2 | Mobility and city structure and their impact on agglomeration economies and diseconomies. (a) Variation of the daily total driven distance with the population for 441 urbanized areas in the US in 2010. The dashed line shows the power-law fit with exponent 0.595 ± 0.026 ($r^2$ = 0.90). (b) Variation of the land area with population for 3540 urbanised areas in the US in 2010. The fit assuming a power-law dependence gives an exponent 0.853 ± 0.011 ($r^2$ = 0.93). Both exponents are smaller than 1, as predicted by our theory. (c) Variation of the total lane miles with population for 363 urbanised areas in the US. A power-law fit (dashed line) gives $L_N/\ell \sim P^{0.765 \pm 0.033}$ ($r^2$ = 0.92). The sublinear behaviour, which agrees with our prediction, means that larger cities need to spend less on infrastructure per capita than smaller ones. (d) Variation of the total delay due to congestion with population for 97 urbanised areas in the US. A power-law fit gives an exponent 1.270 ± 0.067 ($r^2$ = 0.97). The superlinear behaviour agrees with the prediction given by our model and challenges the claims of sustainability of cities.

Total length of roads. If we use the previously derived expression for the area $A$, we find

$$L_N \sim \ell\sqrt{P}\left(\frac{P}{c}\right)^{\delta} \qquad (24)$$

The quantity $\delta$ is less than 1/2, which implies that $L_N$ scales sublinearly with the city's population size. In other words, larger cities need fewer roads per capita than smaller ones: we recover the fact that the agglomeration of people in urban centers involves economies of scale for infrastructure. Within the fitting assumption (Eq. 16), we would obtain the exponent $(1 + a)/2$.

Total delay due to congestion. Unfortunately, agglomeration in cities does not only generate economies. Congestion, for instance, is a major diseconomy associated with the concentration of people in a given area. A simple way to quantify the impairment caused by traffic congestion is through the total delay it generates. If we make the first-order approximation that the average free-flow speed $v$ is the same for everyone, the total delay due to congestion is given, according to our model, by

$$\delta\tau = \frac{1}{v}\sum_{i,j} d_{ij}\left(\frac{T(j)}{c}\right)^{\mu} \qquad (25)$$

where the sum runs over all commuting trips from $i$ to $j$. If we assume that all the centers share the same number of commuters (a reasonable assumption within our model [17]), we obtain

$$\delta\tau \sim \frac{L_{tot}}{v}\left(\frac{P}{k\,c}\right)^{\mu} \qquad (26)$$

which, using the expressions for $L_{tot}$ and $A$ given in Eq. 23 and Eq. 22 respectively, gives

$$\delta\tau \sim \frac{\ell P}{v}\left(\frac{P}{c}\right)^{\delta} \qquad (27)$$
The total commuting time corresponding to the same distance but without congestion scales as $\tau_0 \sim L_{tot}$, and thus grows less rapidly than the total delay, which scales superlinearly with population (even when polycentricity is taken into account). This means that, for the largest cities, delays due to congestion actually dominate the time spent in traffic, and that economic losses per capita due to the time lost in congestion, and the corresponding strain on people's lives, increase with the size of the city.
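The origin of the superlinear exponent is worth spelling out. The sketch below assumes the coherent-growth result $k \sim (P/c)^{2\delta}$ with $\delta = \mu/(2\mu+1)$, the relation $L_{tot} \sim \ell P$ (Eq. 21), and a delay of the form $\delta\tau \sim (L_{tot}/v)\,(P/(k\,c))^{\mu}$, in which each of the $k$ centers receives $P/k$ commuters.

```latex
\begin{align*}
\left(\frac{P}{k\,c}\right)^{\mu}
  &\sim \left[\left(\frac{P}{c}\right)^{1-2\delta}\right]^{\mu}
   = \left(\frac{P}{c}\right)^{\frac{\mu}{2\mu+1}}
   = \left(\frac{P}{c}\right)^{\delta},
  \qquad\text{since } 1 - 2\delta = \frac{1}{2\mu+1}, \\
\delta\tau
  &\sim \frac{L_{tot}}{v}\left(\frac{P}{k\,c}\right)^{\mu}
   \sim \frac{\ell P}{v}\left(\frac{P}{c}\right)^{\delta}
   \;\propto\; P^{1+\delta}.
\end{align*}
```

The delay per capita, $\delta\tau/P \propto P^{\delta}$, therefore grows with city size for any $\mu > 0$, whereas the free-flow commuting time per capita stays constant.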

In the fitting assumption (Eq. 16), and using the same arguments for the calculation of $\delta\tau$, we easily obtain for the exponent the value

$$1 + \frac{\mu}{\mu+1}\left(1 - \frac{a}{2}\right)$$

Transport-related CO2 emissions. Gasoline consumption. Another diseconomy associated with congestion is the quantity of CO2 emitted by cars and the gasoline consumed by motor vehicles. This amount depends not only on the distance that has been driven, but also on the traffic during the journey. It indeed turns out that, for the same distance driven, a car burns more fuel when the traffic is heavy than when the road is clear. Within our model, the presence of traffic is seen in the time spent to cover a given distance, and we write that the quantity of CO2 emitted by a vehicle is proportional to the total time spent in traffic, leading to

$$Q_{CO_2} = q\sum_{i,j}\frac{d_{ij}}{v}\left[1 + \left(\frac{T(j)}{c}\right)^{\mu}\right] \qquad (28)$$

where $q$ is the average quantity of CO2 produced per unit time. In the polycentric case with $k = k(P)$ subcenters, the typical trip length $d_{ij}$ is given by $\sqrt{A/k}$ and we obtain

$$Q_{CO_2} = q\,\frac{\ell P}{v}\left[1 + \left(\frac{P}{c}\right)^{\delta}\right] \qquad (29)$$

The first term in brackets is a constant, and the quantity of CO2 is thus dominated by congestion effects at large populations

$$Q_{CO_2} \sim q\,\frac{\ell P}{v}\left(\frac{P}{c}\right)^{\delta} \qquad (30)$$

and the total daily transport-related CO2 emission per capita thus scales as

$$\frac{Q_{CO_2}}{P} \sim q\,\frac{\ell}{v}\left(\frac{P}{c}\right)^{\delta} \qquad (31)$$

The quantity of CO2 emitted per capita in cities thus increases with the size of the city, a consequence of congestion. This prediction agrees with the exponent we measure (Figure 3) on data gathered for US and OECD cities (data about the area and population of urbanised areas can be found on the Census Bureau website [38], data about congestion in urban areas can be found in the Urban Mobility Report [39], and data about the total lane miles and the daily total miles driven in urbanized areas can be found on the Federal Highway Administration website [40]). We are aware that the scaling of CO2 with population size is controversial, with results varying from one study to another. Although a systematic meta-analysis of these results is beyond the scope of this paper, we note that the authors of [27] are concerned with the total emissions of CO2, while this paper is only concerned with emissions due to transportation. Moreover, our prediction agrees well with the exponent of 1.33 measured by the authors of [26] on the same dataset, but with a different definition of cities. Finally, our prediction also agrees with measurements made in [28] for developing countries.

Another important related quantity is the consumption of gasoline, which in principle is proportional to the emission of CO2 and to the time spent driving. The total daily gasoline consumption is then given by

$$Q_{gas} \sim q\,\frac{\ell P}{v}\left(\frac{P}{c}\right)^{\delta} \qquad (32)$$

where $q$ is now the average quantity of gasoline needed per unit time. From this expression, we see that the total daily gasoline consumption per capita scales as

$$\frac{Q_{gas}}{P} \sim q\,\frac{\ell}{v}\left(\frac{P}{c}\right)^{\delta} \qquad (33)$$

and is therefore not a simple function of the city density, in contrast with what was suggested by the seminal paper of Newman and Kenworthy [4]. At this stage however, more data about gasoline consumption are needed to test this prediction and draw definitive conclusions.

Discussion 

Monocentric versus polycentric. Although polycentricity emerges naturally from our model as a result of congestion, many circumstances can prevent or foster the appearance of new activity centers in a city. There are many debates as to whether policies should favour the polycentric or monocentric development of cities. Most of them are based on ideologies and opinions about how cities should be; very few are based on a quantitative understanding of the city as a complex system. Although this only represents a small part of the debate, our model allows us to quantify the effect of polycentricity on the total delay due to congestion.

We can indeed compute the total delay due to congestion in the case of a monocentric configuration. In this situation, all the population commutes to a single destination 1 and we have

$$\delta\tau_{mono} \sim \frac{L_{tot}}{v}\left(\frac{P}{c}\right)^{\mu} \qquad (34)$$

It follows, using the expression given above for $L_{tot}$,

$$\delta\tau_{mono} \sim \frac{\ell P}{v}\left(\frac{P}{c}\right)^{\mu} \qquad (35)$$

From the fact that $1 + \mu > 1 + \frac{\mu}{2\mu+1}$, we indeed find that the total delay due to congestion is worse for monocentric cities than it is for polycentric cities with the same population, which agrees with the usual intuition. More precisely, the ratio of delays is given by

$$\frac{\delta\tau_{mono}}{\delta\tau} \sim \left(\frac{P}{c}\right)^{\beta} \qquad (36)$$

where the exponent $\beta = \mu - \delta$ is of order $\beta \approx 0.57$. Therefore, even though diseconomies associated with polycentric cities scale superlinearly with population, it would be even worse if we did not let cities evolve from the monocentric case. The same reasoning applies to the consumption of gasoline and the CO2 emissions. This suggests that, everything else being equal, polycentricity should be favoured for quality of life and environmental reasons.
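The quoted value of the ratio exponent can be checked numerically. The sketch below assumes that the monocentric delay scales as $P^{1+\mu}$ and the polycentric one as $P^{1+\delta}$ with $\delta = \mu/(2\mu+1)$, so that their ratio scales with exponent $\beta = \mu - \delta$; it also assumes $\delta = 0.32$ (that is, $2\delta = 0.64$). Function names are illustrative.

```python
def delta_from_mu(mu):
    """Polycentric congestion exponent delta = mu / (2 mu + 1)."""
    return mu / (2.0 * mu + 1.0)

def delay_ratio_exponent(mu):
    """beta = mu - delta: exponent of delta_tau_mono / delta_tau versus P."""
    return mu - delta_from_mu(mu)

def delay_ratio(population, capacity, mu):
    """(P / c)^beta, up to a prefactor of order one that the scaling argument drops."""
    return (population / capacity) ** delay_ratio_exponent(mu)

# mu inferred from delta = 0.32 (i.e. 2 * delta = 0.64) via mu = delta / (1 - 2 delta).
mu = 0.32 / (1.0 - 2.0 * 0.32)
beta = delay_ratio_exponent(mu)
print(f"mu = {mu:.3f}, beta = {beta:.3f}")

# Hypothetical illustration: a tenfold increase of P / c multiplies the ratio by 10^beta.
print(f"ratio growth per decade of P/c: {delay_ratio(10.0, 1.0, mu):.2f}")
```

With these inputs the exponent comes out close to 0.57, so each tenfold increase in $P/c$ widens the gap between monocentric and polycentric delays by a factor of roughly 3.7.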

Figure 3 | Variation of CO2 emissions due to transport with city size. In blue, excess CO2 (in tons) due to congestion, as given by the Urban Mobility Report (2010) for 101 metropolitan areas in the US [39]. In green, we show the estimated CO2 emissions (in tons) due to transport, as given by the OECD for 268 metropolitan areas in 28 different countries (data about the total CO2 emissions due to transportation in major metropolitan areas in the OECD can be found online [41]). The dashed yellow lines represent the least-squares fit assuming a power-law dependency with multiplicative noise, which gives respectively $Q_{CO_2} \sim P^{1.262 \pm 0.089}$ ($r^2$ = 0.94) for the US data and $Q_{CO_2} \sim P^{1.212 \pm 0.098}$ ($r^2$ = 0.83) for the OECD data.

Megapolis versus urban villages. Also, given the superlinear behaviour of the diseconomies associated with living in cities, it is clear that we would be better off living in several smaller cities rather than a single huge city. However, due to the economies of scale realised in large cities, we can wonder whether this is also economically reasonable. If we assume that the total cost of a city of population $P$ is the sum of its infrastructure cost and the economic losses due to congestion, we have

$$C_T(P) = \epsilon_I L_N(P) + \epsilon_C\,\Delta t\,\delta\tau(P) \qquad (37)$$

where $\epsilon_I$ is the average cost of a kilometer of road, $\epsilon_C$ the average hourly wage and $\Delta t$ the planning horizon in years (this expression is not exhaustive, as the costs due to CO2 emissions and gasoline consumption are not included). The infrastructure needs maintenance, and its cost depends on the planning horizon as well; it can be written $\epsilon_I = \epsilon_B + \Delta t\,\epsilon_M$, where $\epsilon_B$ is the construction cost in dollars per km and $\epsilon_M$ the maintenance cost in dollars per km per year.

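We assume that the population $P$ is distributed among $n$ cities of the same size $P/n$ (see Figure 4). The total lane miles for the $n$ cities reads $L_N^{(n)}(P) = n\,L_N(P/n)$, where $L_N(P) \sim \ell\sqrt{P}\,(P/c)^{\delta}$ is the total lane miles for one city. The total congestion delay for $n$ cities is $\delta\tau_n = n\,\delta\tau(P/n)$ and we thus obtain the total cost $C_T(P, n)$ for $n$ cities

$$C_T(P, n) = n^{-\delta}\,\ell\left(\frac{P}{c}\right)^{\delta}\left[\epsilon_I\sqrt{nP} + \frac{P}{v}\,\epsilon_C\,\Delta t\right] \qquad (38)$$

This cost is minimal when $\frac{dC_T}{dn} = 0$, leading to (for $\Delta t \gg 1$)

$$n_{min} = P\left(\frac{2\delta}{1 - 2\delta}\,\frac{\epsilon_C}{v\,\epsilon_M}\right)^2 \qquad (39)$$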



(the actual number of cities is of course an integer, and can be taken as the nearest integer to $n_{min}$ for instance). It is then economically advantageous to divide the population into several cities if $n_{min} > 2$. To illustrate this point, we compute the number of cities which would minimise the cost for a world population $P \sim 10^9$. The World Bank estimates the maintenance cost of roads to be of the order of $\epsilon_M \sim 10^5$ dollars/km/year, and the average hourly wage to be of the order of $\epsilon_C \sim 10$ dollars/h; the value of $\delta$ is taken from the measures on US data, $\delta \approx 0.27$, and the speed is taken as $v \sim 10$ km/h. We then obtain

$$n_{min} = 180 \qquad (40)$$

which gives an average city size of $P/n \approx 5,500,000$. This result is to be put in perspective with the fact that the world hosts 40 or so cities with over 5,500,000 inhabitants and that this number is still increasing.
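The minimisation over the number of cities can be cross-checked numerically. The sketch below assumes a cost of the form $n^{-\delta}\,[\epsilon_I\sqrt{nP} + (P/v)\,\epsilon_C\,\Delta t]$, dropping the common factor $\ell\,(P/c)^{\delta}$, and compares the stationary point obtained by hand with a brute-force search over integers. All parameter values are hypothetical and dimensionless; unit conversions (for instance between a daily delay and a yearly maintenance cost) are deliberately left out, so this checks the algebra rather than the figure of 180.

```python
def total_cost(n, P, delta, eps_infra, eps_wage, horizon, speed):
    """Cost for n equal cities, up to the common factor ell * (P / c)^delta."""
    infrastructure = eps_infra * (n * P) ** 0.5
    congestion = (P / speed) * eps_wage * horizon
    return n ** (-delta) * (infrastructure + congestion)

def n_min_closed_form(P, delta, eps_infra, eps_wage, horizon, speed):
    """Stationary point of total_cost with respect to n (n treated as continuous)."""
    root_n = (2 * delta / (1 - 2 * delta)) * eps_wage * horizon * P ** 0.5 / (speed * eps_infra)
    return root_n ** 2

def n_min_numeric(P, delta, eps_infra, eps_wage, horizon, speed, n_max=10_000):
    """Brute-force minimum over integer n, used only as a cross-check."""
    return min(range(1, n_max + 1),
               key=lambda n: total_cost(n, P, delta, eps_infra, eps_wage, horizon, speed))

# Hypothetical, dimensionless parameter values (not the World Bank figures).
params = dict(P=1.0e6, delta=0.27, eps_infra=50.0, eps_wage=1.0, horizon=1.0, speed=1.0)
closed = n_min_closed_form(**params)
numeric = n_min_numeric(**params)
print(f"closed form: {closed:.1f}, brute force: {numeric}")
```

For a long planning horizon, $\epsilon_I \approx \Delta t\,\epsilon_M$ and the horizon cancels out of the closed form, which is why the optimum then depends only on the ratio of wage to maintenance cost, the speed and $\delta$.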

The most economical population distribution. The previous results assume that we split a large city into many cities of the same size. Cities, however, come in various sizes, distributed according to something that can be approximated by a Pareto distribution, as known since Zipf's work [29]. It is still unclear why we observe such a convergence [30,31]. We propose here a new perspective on this debate by asking: assuming cities are distributed according to a Pareto distribution, what value of the exponent minimises the overall cost? Indeed, from the above, the total cost for a population size $x$ is given at large times by

$$C_T(x) = \epsilon_M\,\Delta t\,L_N(x) + \epsilon_C\,\Delta t\,\delta\tau(x) \qquad (41)$$



We assume that the population is distributed according to 

V[x) = [y-l)x~ Y for x e [1, A] (42) 

with y > 1 and a cut-off population A^>1 (which is at most equal to 
the world's population). The average cost is then given by 

CV= f T{x)C T (x) dx (43) 






Figure 4 | Scaling down. We consider a population $P$ (one large city) and see how indicators change when we compare it with a system with many cities and the same total 
population ($n$ smaller cities of size $P/n$). (a) Variation of the yearly delay per capita due to congestion with the number of cities (normalised by the value $\delta\tau(1)$ corresponding to the 
single city case). (b) Evolution of the infrastructure length with the number of cities (normalised by the value $L_N(1)$ corresponding to the single city case). 
Relative gains in terms of commuting time per person decrease faster than infrastructure costs increase, suggesting that life in cities could be 
improved at a relatively low cost by decentralisation. 



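leading to 

$$\overline{C_T} = \Delta t\,(\gamma-1)\,c^{-\delta}\left[\epsilon_M\,\ell\,\frac{\Lambda^{-\gamma+\delta+\frac{3}{2}}-1}{-\gamma+\delta+\frac{3}{2}} + \epsilon_C\,\tau\,\frac{\Lambda^{-\gamma+\delta+2}-1}{-\gamma+\delta+2}\right] \qquad (44)$$

The only consistent solution is obtained for $\gamma < \delta + 2$. The dominant 
term for $\Lambda \gg 1$ is given by 

$$\overline{C_T} \simeq \Delta t\,\epsilon_C\,\tau\,c^{-\delta}\,(\gamma-1)\,\frac{\Lambda^{-\gamma+\delta+2}}{-\gamma+\delta+2} \qquad (45)$$

The optimal power law distribution minimizes the average cost and 
is such that $\partial \overline{C_T}/\partial\gamma = 0$. We obtain the following equation 

$$\frac{1+\delta}{\delta+2-\gamma} = (\gamma-1)\ln\Lambda \qquad (46)$$

and in the limit $\Lambda \gg 1$ we obtain the optimal value for $\gamma$ 

$$\gamma^{*} = 2 + \delta - \frac{1}{\ln\Lambda} \qquad (47)$$
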



Numerically, $\delta \approx 0.32$ and $\Lambda \sim 10^{9}$, leading to $\gamma^{*} \approx 2.27$. It is 
interesting to note that this value would lead to a rank-plot 
exponent ($\approx 0.78$) not far from those measured in different 
countries around the world 32. Although we do not claim that the 
above reasoning provides a definitive answer to the Zipf puzzle, it 
nevertheless suggests that the broad diversity of population might 
derive from economic considerations, and that there may be a 
connection between the Zipf law exponent and optimality 
considerations. 
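
The numbers quoted above can be reproduced with a small sketch. It compares the large-$\Lambda$ approximation $\gamma^{*} = 2 + \delta - 1/\ln\Lambda$ with the larger root of the stationarity condition $(1+\delta)/(\delta+2-\gamma) = (\gamma-1)\ln\Lambda$, and converts the result to a rank-plot exponent through the standard Pareto relation $1/(\gamma-1)$. That relation is an assumption made here for illustration, since the text does not state it explicitly, and the function names are hypothetical.

```python
import math


def gamma_star_approx(delta, cutoff):
    """Large-cutoff approximation: 2 + delta - 1/ln(cutoff)."""
    return 2 + delta - 1 / math.log(cutoff)


def gamma_star_exact(delta, cutoff):
    """Larger root of (1+delta)/(delta+2-gamma) = (gamma-1)*ln(cutoff).

    With u = gamma - 1 the condition becomes the quadratic
    u**2 - (1+delta)*u + (1+delta)/ln(cutoff) = 0.
    The larger root lies just below delta + 2 and is the local minimum
    of the dominant cost term.
    """
    log_cutoff = math.log(cutoff)
    b = 1 + delta
    disc = b * b - 4 * b / log_cutoff
    u = (b + math.sqrt(disc)) / 2
    return 1 + u


def rank_exponent(gamma):
    """Rank-size exponent for a Pareto density x**(-gamma) (assumed relation)."""
    return 1 / (gamma - 1)
```

For $\delta = 0.32$ and $\Lambda = 10^{9}$, both functions give $\gamma^{*} \approx 2.27$, and `rank_exponent(2.27)` is close to 0.79, in line with the value of about 0.78 quoted in the text.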



Outlook. The superlinear increase of congestion delay with 
population, and thereby of gasoline consumption and of CO$_2$ 
emissions, has terrible consequences for the economy, the 
environment, health and well-being. The outlook is nothing short 
of grim in our ever-urbanising world. As the proportion of human 
beings living in cities dramatically increases (the UN expects the 
world population to be 67% urban in 2050 33), wages are likely to 
increase 7, but not enough to compensate for the negative effects of 
congestion. As a result, if the individual car stays the dominant 
transportation mode, cities will put more strain on people's lives, 
while acting as catalysts for the production of CO$_2$, a greenhouse gas 
responsible for an overall increase of the planet's temperature 34. It is 
currently believed that the advantages associated with living in a large 
city outweigh the costs. Our results reveal however the existence of 
very rapidly growing problems such as congestion and CO$_2$ 
emissions, which inevitably raises the question of the sustainability 
of large cities. It might be time to cut down considerably the use of 
individual vehicles, or to consider the possibility of living in smaller 
or medium-sized cities: the infrastructure costs ($L_N$) may be larger, 
but the impact on the environment (CO$_2$ emissions) and on the 
well-being of people (delays in congestion) would be beneficial (see 
Figure 3). 

The most striking fact about the above results is that despite the 
appearance of complexity that is conveyed by cities, most of their 
structure can be explained by the very simple and universal desire 
for the best achievable balance between income and commuting 
costs. Our model unifies mobility patterns, spatial structure of cities 
and allometric scalings in a framework that can be built upon. More 
work is needed in order to integrate information about firm locations, 
the influence of public transportation on mobility patterns 35, 
the effect of the integration of cities into urban systems 36, to understand 
the fluctuations around the average trends, and to test the 
validity of the model on different sets of data. We believe however 
that the results presented here represent a crucial step towards a 
scientific understanding of cities. 

Methods 

Data. As recently stressed in 37, when trying to identify patterns across cities, one 
must be careful and consistent in the definition of city boundaries. These authors 
indeed found that the scaling exponents measured for several quantities are 
usually sensitive to the definition chosen for the city. In order to make our results 
reproducible, we detail in the following the data sources and the corresponding city 
definitions. 

Total distance driven and lane miles. The daily commuting vehicle-miles as well as the 
total lane miles data were obtained for the year 2011 from the Federal Highway 
Administration for 441 Urban Areas (as defined by the Census Bureau) in the US. 

Area. The surface area data were obtained for the year 2010 from the Census Bureau 
for 3540 Urban Areas (as defined by the Census Bureau) in the US. It is interesting to 
note that the dependence of the surface area of Metropolitan Statistical Areas (MSAs) on 
population is a lot less clear-cut, implying that, with respect to surface area, the 
definition of Urban Areas (UAs) delineates more coherent systems than the definition of MSAs. 

Values of $\ell$. In order to compute a value for $\ell = s/t$, we use for $s$ the average wage at the 
county level, provided by the Bureau of Labor Statistics. For $t$, the transportation cost 
per unit distance, we use the average gas price per state as given by the U.S. Energy 
Information Administration, and assume that all vehicles burn the same quantity of 
gas per unit distance on average. Interestingly, while we have assumed a constant $\ell$ 
throughout this paper, we have noticed that its effect on the different scalings was not 
negligible (compare the results for $L_N$ and $A$ between Table 1 and Table 2, for 
instance), implying that $\ell$ has a small, yet non-zero, dependence on the population. 
This probably comes from the dependence of the average wage on population 7. We 
leave the investigation of this dependence for further studies. 

Total delay and CO$_2$ emissions. The excess CO$_2$ and the total delay due to traffic 
congestion were obtained for the year 2012 from the Urban Mobility Report for 97 
Urban Areas in the US. Also, the quantity of CO$_2$ emissions due to transportation was 
obtained from the OECD for 268 metropolitan areas across 28 countries for the year 
2008. It is worth noting here that the US definition of Urban Area and the OECD 
definition of Metropolitan Area are qualitatively different, and that 
OECD data cover many different countries. Yet, the measured values of the exponent 
are compatible with each other. 

As far as the United States is concerned, we present results for Urban Areas only. 
Indeed, when data were available for both MSAs and Urban Areas, we found that 
the MSA data did not exhibit as clear-cut regularities as the Urban Area data did. We 
believe that this effect is due to the lack of a unique, quantitative definition of a city. 
In this work, we assumed that Urban Areas designate areas which are 
coherent with respect to the quantities we are measuring, and leave the crucial issue of 
city definition for further studies. 